Abstract

This paper computes the sensing capacity of a sensor network, with sensors of limited range, sensing a two-dimensional Markov random field, by modeling the sensing operation as an encoder. Sensor observations are dependent across sensors, and the sensor network output across different states of the environment is neither identically nor independently distributed. Using a random coding argument, based on the theory of types, we prove a lower bound on the sensing capacity of the network, which characterizes the ability of the sensor network to distinguish among environments with Markov structure, to within a desired accuracy.

I. Introduction 

We investigate how spatial Markov structure in the environment affects the number of sensors required to sense that environment to within a desired accuracy. We explore this relationship in the context of discrete sensor network applications such as distributed detection and classification. The number of sensors required to achieve a desired performance level depends on the characteristics of the environment (e.g. target sparsity, likely target configurations, target contiguity), the constituent sensors (e.g. noise, range, sensing function), and the resource constraints at sensor nodes (e.g. power, computation, communications). Resource constraints such as communications and power are important to consider in the design of sensor networks due to the limitations they impose on, among other things, network lifetime and sampling rate. See, for example, [1], [2], [3] for a discussion on the effects of resource constraints on sensor networks. However, even if these resource constraints were eliminated, many basic questions about the theoretical design limitations of sensor networks are not yet adequately addressed. The sensing capabilities of the sensors, the spatial characteristics of the environment being sensed, and the required accuracy of the sensing task impose sharp limitations on the number of sensors required to achieve a desired performance level. We elucidate this purely sensing-based limitation by demonstrating a lower bound on the minimum number of sensors required to achieve a desired sensing performance, given the sensing capabilities of the sensors and a spatial Markov model of the environment. External constraints, such as power, communication, bandwidth, and computation, are not considered in this paper.

We model the presence/absence of targets in a two-dimensional grid as a Markov random field [4], and the sensor network as a 'channel encoder' (Figure 1). This 'encoder' maps the grid of targets into a vector of sensor outputs, which corresponds to a "codeword." These sensor outputs are then corrupted by noise. The decoder observes this noisy codeword and provides an estimate of the spatial target configuration. Viewing the sensor network as a channel encoder allows us to use ideas from Shannon coding theory. However, the messages do not necessarily occur with equal probability, unlike messages in classical channel codes. In addition, as we will show, the "codebook" obtained has codewords which are neither independent nor identically distributed. These differences require a novel analysis and a novel concept of 'sensing capacity' $C(D)$. The distortion $D$ is the maximum tolerable fraction of spatial positions which may be erroneously sensed. For a given $D$, $C(D)$ represents the maximum ratio of the total number of target positions under observation to the number of sensors, such that below this ratio, there exist sensor networks whose average probability of error goes to zero as the number of possible target positions and sensors goes to infinity.

In previous work [5], we introduced the concept of a sensing capacity. We extended this work in [6] to account for arbitrary sensing functions and localized sensing of a one-dimensional target vector, with i.i.d. targets. In this paper we explore the effect of Markov structure in a two-dimensional environment on the sensing capacity, as occurs in several practical applications (e.g. robotic demining and prospecting, distributed surveillance). We model the environment as a Markov random field, and show an extension of the theory of types to include Markov random fields. Section II introduces and motivates our sensor network model. Section III states a lower bound on sensing capacity for the model. Illustrative calculations of the sensing capacity appear in Section IV.
Fig. 1. Sensor network modeled as a channel encoder.

Fig. 2. Sensor network model with $k = 5$, $n = 2$, $c = 1$.

II. Sensor Network Model 

We denote random variables and functions by upper-case letters, and instantiations or constants by lower-case letters. Bold-font denotes vectors, and $\log(\cdot)$ has base 2. Sets are denoted using calligraphic script. $D(P\|Q)$ denotes the Kullback-Leibler distance and $H(P)$ denotes entropy.

We consider the problem of sensing discrete two-dimensional environments with spatial structure. Examples
include camera networks that localize people in a room, seismic sensor networks that localize moving objects, 
minefield mapping, and soil mapping. There exists a large body of work in distributed detection [7], but we are 
not aware of the existence of any 'sensing capacity' results. [8] introduced the idea of viewing sensor networks as 
encoders, and used algebraic coding theory to design highly structured sensor networks, but no notion of capacity 
was discussed. 

The model we present attempts to abstractly characterize the discrete sensor network applications listed above. Figure 2 shows an example of our sensor network model. There are $k^2$ discrete spatial positions that need to be sensed in a $k \times k$ grid. Each discrete position may contain no target or one target, though extensions to non-binary targets are straightforward. Thus, the target configuration is represented by a $k^2$-bit 'target field' $\mathbf{f}$. The possible target fields are denoted $\mathbf{f}_i$, $i \in \{1, \ldots, 2^{k^2}\}$. We say that 'a certain $\mathbf{f}$ has occurred' if that field represents the true spatial target configuration. Target fields occur with probability $P_F(\mathbf{f})$ and are assumed to be distributed as a pairwise Markov random field (also referred to as an auto-model) [4], a widely used model (e.g. distributed detection, image processing) that captures spatial dependencies while still allowing for efficient algorithms. This model differs from the equiprobable i.i.d. target distribution explored in our previous work, and allows one to model environments with structure such as target sparsity, likely target configurations, and spatial contiguity among targets. We remark that the methods used in this paper can be directly extended to more complex Markov field models (besides pairwise Markov), at the price of more cumbersome notation. A pairwise Markov random field (Figure 2) is modeled as a graph, where each target position corresponds to a node $F_h$. The subscript $h$ indexes the set of possible grid locations. The set of four grid blocks directly adjacent to $F_h$, which are neighbors of $F_h$ in the graph, is written as $\mathcal{N}_h$. We assume circular boundary conditions; i.e. the targets on the boundaries are adjacent to the opposite boundary. We assume that all $\mathbf{f}$ have positive probability, and that given its neighbors, the probability of a target is independent of the remaining targets. According to the Hammersley-Clifford theorem, a Markov random field that obeys these two properties is distributed as a Gibbs distribution [4]. A Gibbs distribution is written as a normalized product of positive functions over the cliques in the graph of the Markov random field. In our pairwise Markov random field there are two types of cliques: single nodes $\{F_h\}$ with associated function $\frac{1}{W} P_F$, and pairwise cliques $\{(F_h, F_v) : v \in \mathcal{N}_h\}$ with associated function $P_{F|F'}$. The constant $W$ is defined as $W = \sum_{\mathbf{t} \in \{0,1\}^5} P_F(t_5) \prod_{r=1}^{4} P_{F|F'}(t_5|t_r)$. Thus, we have the following Gibbs distribution for $\mathbf{f}$,

$$P_F(\mathbf{f}) = Z^{-1} \prod_{h} W^{-1} P_F(f_h) \prod_{v \in \mathcal{N}_h} P_{F|F'}(f_h|f_v) \tag{1}$$

where $Z$ is a normalization constant.
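As a minimal sketch of how the product in (1) can be evaluated, the following Python fragment computes $W$ and the base-2 logarithm of (1) without the global constant $Z^{-1}$. The function names, the nested-list field representation, and the indexing convention `p_cond[v][h]` for $P_{F|F'}(f_h = h \mid f_v = v)$ are assumptions made for illustration; the numerical values follow the symmetric model of Section IV with a hypothetical $p = 0.8$.

```python
import itertools
import math

def gibbs_constant_w(p_f, p_cond):
    """W = sum over t in {0,1}^5 of P_F(t5) * prod_{r=1..4} P_{F|F'}(t5 | t_r)."""
    total = 0.0
    for t in itertools.product((0, 1), repeat=5):
        *neighbors, center = t
        term = p_f[center]
        for t_r in neighbors:
            term *= p_cond[t_r][center]  # assumed indexing: p_cond[neighbor][center]
        total += term
    return total

def log2_gibbs_unnormalized(field, p_f, p_cond):
    """log2 of the product in (1), omitting the global constant Z^{-1}."""
    k = len(field)
    log_w = math.log2(gibbs_constant_w(p_f, p_cond))
    total = 0.0
    for r in range(k):
        for c in range(k):
            center = field[r][c]
            total += math.log2(p_f[center]) - log_w
            # Four adjacent grid blocks, with circular boundary conditions.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                neighbor = field[(r + dr) % k][(c + dc) % k]
                total += math.log2(p_cond[neighbor][center])
    return total

p = 0.8  # hypothetical value; index 0 means "no target"
p_f = [p, 1 - p]
p_cond = [[p, 1 - p], [1 - p, p]]
empty = [[0] * 4 for _ in range(4)]
one_target = [row[:] for row in empty]
one_target[1][2] = 1
w = gibbs_constant_w(p_f, p_cond)
log_empty = log2_gibbs_unnormalized(empty, p_f, p_cond)
log_one = log2_gibbs_unnormalized(one_target, p_f, p_cond)
print(w, log_empty, log_one)
```

Under these assumed parameters the empty field receives a larger weight than the field with one isolated target, which is the kind of sparsity preference the pairwise model is meant to capture.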

The sensor network has $n$ identical sensors. Sensor $\ell$ located at grid block $F_h$ senses (i.e. is connected to in the graph) a set of contiguous target positions within a Euclidean distance $c$ of its location (though our approach can be extended to other sensor coverage models). Circular boundary conditions, discussed above, are assumed. Figure 2 depicts sensors with range $c = 1$. Each sensor outputs a value $x \in \mathcal{X}$ that is an arbitrary function of the targets which it senses, $x = \Psi(\{f_v : v \in \mathcal{S}_{c,h}\})$, where $\mathcal{S}_{c,h}$ is the coverage of a sensor located at grid block $F_h$ with range $c$. Since the number of targets sensed by a sensor depends only on the sensor range, we write the number of targets in a sensor's coverage as $|\mathcal{S}_c|$. One example of a sensing function is a weighted sum of the targets. This function corresponds to a seismic sensor, which senses the weighted sum of target vibrations. The 'ideal output vector' of the sensor network $\mathbf{x}$ depends on the sensor connections, sensing function, and on the target field $\mathbf{f}$. However, we assume that each sensor output $y \in \mathcal{Y}$ is corrupted by noise, so that the conditional p.m.f. $P_{Y|X}(y|x)$ determines the observed output. Since the sensors are identical, $P_{Y|X}$ is the same for all the sensors. Further, we assume that the noise is independent across the sensors, so that the 'sensor output vector' $\mathbf{y}$ relates to the ideal output $\mathbf{x}$ as $P_{Y|X}(\mathbf{y}|\mathbf{x}) = \prod_{\ell=1}^{n} P_{Y|X}(y_\ell|x_\ell)$. Observing the output $\mathbf{y}$, a decoder (described below) must determine which of the $2^{k^2}$ target fields $\mathbf{f}_i$ occurred.

We define the sensor network $S(k^2, n, c)$ as a graph (Figure 2) with connections between $n$ sensors and the $k^2$ spatial positions, and the noise-corrupted observations of the ideal sensor outputs. We assume a simple model for randomly constructing such sensor networks, where each sensor chooses a region of Euclidean radius $c$ (as constructed above) with equal probability among the set of possible regions of radius $c$. This would occur, for example, if sensors were randomly dropped on a field, or robots moved randomly over a region.
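The random construction described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' simulator: the sensing function is taken to be an unweighted sum of the covered targets, the noise model replaces an output by a different value of the output alphabet with probability `flip_prob` (for $c = 0$ this reduces to a bit flip), and the grid is assumed to satisfy $k > 2c$ so that a coverage region does not wrap onto itself. All function names are hypothetical.

```python
import random

def coverage(center, c, k):
    """Grid blocks within Euclidean distance c of `center` on a k x k torus."""
    r0, c0 = center
    reach = int(c)
    return [((r0 + dr) % k, (c0 + dc) % k)
            for dr in range(-reach, reach + 1)
            for dc in range(-reach, reach + 1)
            if dr * dr + dc * dc <= c * c]

def sample_outputs(field, n, c, flip_prob, rng):
    """Drop n sensors uniformly at random; return (ideal, noisy) output vectors."""
    k = len(field)
    n_values = len(coverage((0, 0), c, k)) + 1  # sums take values 0..|S_c|
    ideal, noisy = [], []
    for _ in range(n):
        center = (rng.randrange(k), rng.randrange(k))
        x = sum(field[r][col] for r, col in coverage(center, c, k))
        y = x
        if rng.random() < flip_prob:
            # Assumed noise model: jump to a uniformly chosen different value.
            y = rng.choice([v for v in range(n_values) if v != x])
        ideal.append(x)
        noisy.append(y)
    return ideal, noisy

rng = random.Random(0)
full = [[1] * 5 for _ in range(5)]
ideal, noisy = sample_outputs(full, n=2, c=1, flip_prob=0.0, rng=rng)
print(sorted(coverage((0, 0), 1, 5)), ideal, noisy)
```

With range $c = 1$ each sensor covers its own grid block and the four adjacent blocks, so $|\mathcal{S}_c| = 5$ in this sketch.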

III. Sensor Network Capacity Theorem 

For a sensor network, randomly generated as explained above, the ideal output $\mathbf{x}$ is a function of the sensor network instantiation $S(k^2, n, c)$, the sensing function $\Psi$, and the occurring target field $\mathbf{f}$. Denote $\mathbf{X}_i$ as the random vector which occurs when $\mathbf{f}_i$ is the target field (where $\mathbf{X}_i$ is random because of the random generation of the sensor network $S(k^2, n, c)$). Since each sensor independently forms connections to a subset of targets, $P_{X_i}(\mathbf{x}_i) = \prod_{\ell=1}^{n} P_{X_i}(x_{i\ell})$. However, it is important to note that when not conditioned on the occurrence of a specific target field $\mathbf{f}_i$, the sensor outputs are not independent. Further, we also note that the random vectors $\mathbf{X}_i$ and $\mathbf{X}_j$, associated with a pair of target fields $\mathbf{f}_i$ and $\mathbf{f}_j$ respectively, are not independent, since the sensor network configuration produces a dependency between them (i.e. similar target fields are likely to produce a similar sensor network output). Thus, the 'codewords' $\{\mathbf{X}_i, i = 1, 2, \ldots, 2^{k^2}\}$ of the sensor network (one corresponding to each $\mathbf{f}_i$) are non-identical and dependent on each other, unlike channel codes in classical information theory. Further, the messages $\{\mathbf{f}_i\}$ to which these 'codewords' correspond are not equally likely, necessitating a different analysis.

Given the noise-corrupted sensor network output $\mathbf{y}$, we estimate the target field $\mathbf{f}$ which generated this noisy output by using a decoder $g(\mathbf{y})$. We allow the decoder a distortion of $D \in [0, 1]$. Let $D_H(\mathbf{f}_i, \mathbf{f}_j)$ denote the Hamming distance between two target fields, and let the tolerable distortion region of $\mathbf{f}_i$ be $\mathcal{D}_i = \{j : \frac{1}{k^2} D_H(\mathbf{f}_i, \mathbf{f}_j) < D\}$. Given that $\mathbf{f}_i$ occurred, the probability of error for sensor network $s$ is $P_{e,i,s} = \Pr[\text{error} \mid i, s, \mathbf{x}_i, \mathbf{y}] = \Pr[g(\mathbf{y}) \notin \mathcal{D}_i \mid i, s, \mathbf{x}_i, \mathbf{y}]$. Averaging $P_{e,i,s}$ over all sensor networks, we write the average error probability, given $\mathbf{f}_i$ occurred, as $P_{e,i} = E[P_{e,i,s}]$. We use the average error probability $P_e = \sum_i P_{e,i} P_F(\mathbf{f}_i)$ as our error metric.
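The distortion criterion can be illustrated with a short Python sketch. The helper names and the nested-list field representation are assumptions for illustration; the strict inequality mirrors the definition of the tolerable distortion region given above.

```python
def normalized_hamming(field_a, field_b):
    """Fraction of the k^2 grid positions at which two target fields differ."""
    k = len(field_a)
    differing = sum(a != b
                    for row_a, row_b in zip(field_a, field_b)
                    for a, b in zip(row_a, row_b))
    return differing / (k * k)

def within_distortion(field_i, field_j, distortion):
    """True if field_j lies in the tolerable distortion region of field_i."""
    return normalized_hamming(field_i, field_j) < distortion

truth = [[0] * 4 for _ in range(4)]
estimate = [row[:] for row in truth]
estimate[0][0] = 1
estimate[3][2] = 1
print(normalized_hamming(truth, estimate))
print(within_distortion(truth, estimate, 0.2))
```

A decoder output counts as an error only when it falls outside this region, so an estimate that misplaces a small fraction of targets is still accepted.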

We define the 'rate' of the sensor network as the ratio of target positions to sensors, $R = k^2/n$. Given a tolerable distortion $D$, we call $R$ achievable if the sequence of sensor networks $S(\lceil nR \rceil, n, c)$ satisfies $P_e \to 0$ as $n \to \infty$. The sensing capacity of the sensor network is defined as $C(D) = \max R$ over achievable $R$.

The main result of this paper is to show that the sensing capacity $C(D)$ of the sensor network model presented in this paper is non-zero, and to characterize it as a function of environmental structure $P_F$, noise $P_{Y|X}$, sensing function $\Psi$, and sensor range $c$. The proof broadly follows the proof of channel capacity provided by Gallager [9], by analyzing a union bound of pairwise error probabilities, averaged over randomly generated sensor networks. However, it differs from [9] in several important ways. One primary difference arises due to our 'encoder' (i.e. sensor network). Rather than randomly generating pairwise independent codewords as in the Shannon capacity proof, our encoder corresponds to a randomly generated sensor network. Given this encoder (sensor network), the codewords are dependent on each other and non-identically distributed. To overcome this complication, we observe that since each sensor in our network randomly chooses a set of contiguous targets, we can use the method of types [10] to group the exponential number of pairwise error probability terms into a polynomial number of terms in order to prove convergence of error probability. A second primary difference is that we analyze two-dimensional messages that are not equally likely. Thus, rather than using a maximum likelihood decoder in our proof, we use a maximum a posteriori decoder. Further, the statement of the main result requires the extension of the existing definition of higher order types [10] to accommodate two-dimensional fields. In our proof, we will use two kinds of types.

The field type $\phi$: Since the probability distribution of a pairwise Markov random field has a factorized form, depending only on quintuplets of values as shown in (1), we can rewrite the probability of a Markov random field as a product over the set of possible quintuplets. Each term in the product will have a degree equal to the number of times that quintuplet of values occurred in the field. We refer to the vector of normalized counts of the number of times each quintuplet occurred in a field $\mathbf{f}_i$ as the field type $\phi_i$. $\phi_i$ is a normalized thirty-two dimensional vector for binary fields. (1) can be rewritten in terms of $\phi_i$ as follows,

$$P_F(\mathbf{f}_i) = P_F(\phi_i) = Z^{-1} \prod_{\mathbf{t} \in \{0,1\}^5} \Big( W^{-1} P_F(t_5) \prod_{r=1}^{4} P_{F|F'}(t_5|t_r) \Big)^{k^2 \phi_{i,\mathbf{t}}} \tag{2}$$
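A minimal Python sketch of the field type follows. It counts, for every grid position, the quintuplet formed by the four adjacent values and the center value (circular boundary conditions), and also evaluates the entropy of the type and its Kullback-Leibler distance to the normalized quintuplet model $\frac{1}{W} P_F \prod_r P_{F|F'}$. The ordering of neighbors inside a quintuplet, the function names, and the indexing `p_cond[v][h]` for $P_{F|F'}(h \mid v)$ are assumptions made for illustration.

```python
import itertools
import math
from collections import Counter

def field_type(field):
    """Normalized counts of (up, down, left, right, center) quintuplets."""
    k = len(field)
    counts = Counter()
    for r in range(k):
        for c in range(k):
            quintuplet = (field[(r - 1) % k][c], field[(r + 1) % k][c],
                          field[r][(c - 1) % k], field[r][(c + 1) % k],
                          field[r][c])
            counts[quintuplet] += 1
    return {t: counts[t] / (k * k) for t in itertools.product((0, 1), repeat=5)}

def quintuplet_model(p_f, p_cond):
    """(1/W) P_F(t5) prod_r P_{F|F'}(t5 | t_r) as a distribution over {0,1}^5."""
    raw = {t: p_f[t[4]] * math.prod(p_cond[t_r][t[4]] for t_r in t[:4])
           for t in itertools.product((0, 1), repeat=5)}
    w = sum(raw.values())
    return {t: v / w for t, v in raw.items()}

def entropy_bits(dist):
    return -sum(v * math.log2(v) for v in dist.values() if v > 0)

def kl_bits(dist, ref):
    return sum(v * math.log2(v / ref[t]) for t, v in dist.items() if v > 0)

checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]
phi = field_type(checker)
model = quintuplet_model([0.8, 0.2], [[0.8, 0.2], [0.2, 0.8]])
print(entropy_bits(phi), kl_bits(phi, model))
```

For a field of type $\phi$, the base-2 logarithm of the product in (2), without $Z^{-1}$, equals $-k^2$ times the sum of these two quantities, which is why the entropy and divergence of field types appear in the capacity bound.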

The sensor types $\gamma$ and $\lambda$: For a sensor located randomly in the target field, the probability of a sensor producing a value depends on the number of target patterns that correspond to the sensor's range, and thus can be written as a function of the frequency with which each pattern occurs in the field. The sensor type $\gamma_i$ is a vector that corresponds to the normalized counts over the set of possible target configurations in the sensor's field of view in a field $\mathbf{f}_i$. For a sensor of range $c$, $\gamma_i$ is a $2^{|\mathcal{S}_c|}$-dimensional vector, where each entry in the vector $\gamma_i$ corresponds to the frequency of occurrence of one of the possible $|\mathcal{S}_c|$-bit patterns.

Since each sensor independently chooses a set of contiguous spatial positions to sense, the distribution of its ideal output $\mathbf{X}_i$ (which is sensed when the $i$th target field $\mathbf{f}_i$ occurs) depends only on the type $\gamma$ of $\mathbf{f}_i$; i.e., for a sensing function $\Psi$, a range $c$, and a target field $\mathbf{f}_i$ of type $\gamma_i$, $P_{X_i}(\mathbf{x}_i) = P^{\gamma_i,n}(\mathbf{x}_i) = \prod_{\ell=1}^{n} P^{\gamma_i}(x_{i\ell})$ for all $\mathbf{f}_i$ of type $\gamma_i$ [5].
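The dependence of the ideal output distribution on the sensor type alone can be illustrated as follows. The sketch computes $\gamma$ as the normalized pattern counts over all possible sensor locations and then pushes it through a sensing function to obtain $P^{\gamma}(x)$. The unweighted-sum sensing function, the fixed ordering of offsets inside a pattern, and the function names are assumptions for illustration.

```python
from collections import Counter

def offsets(c):
    """Relative positions within Euclidean distance c, in a fixed order."""
    reach = int(c)
    return [(dr, dc)
            for dr in range(-reach, reach + 1)
            for dc in range(-reach, reach + 1)
            if dr * dr + dc * dc <= c * c]

def sensor_type(field, c):
    """Normalized counts of the |S_c|-bit patterns seen over all sensor locations."""
    k = len(field)
    offs = offsets(c)
    counts = Counter(
        tuple(field[(r + dr) % k][(col + dc) % k] for dr, dc in offs)
        for r in range(k) for col in range(k))
    return {pattern: cnt / (k * k) for pattern, cnt in counts.items()}

def ideal_output_pmf(gamma, sensing_fn=sum):
    """P^gamma(x): probability that a uniformly placed sensor outputs x."""
    pmf = Counter()
    for pattern, freq in gamma.items():
        pmf[sensing_fn(pattern)] += freq
    return dict(pmf)

checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]
gamma = sensor_type(checker, 1)
pmf = ideal_output_pmf(gamma)
print(gamma, pmf)
```

Any two fields with the same pattern frequencies yield the same `pmf`, which is the property that lets the proof group fields by type.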

Next, we note that for sensors of range $c$ the conditional probability $P_{X_j|X_i}$ depends on the joint sensor type $\lambda$ of the $i$th and $j$th target fields $\mathbf{f}_i$, $\mathbf{f}_j$. $\lambda$ is the matrix of $\lambda_{(t_1 \ldots t_{|\mathcal{S}_c|})(u_1 \ldots u_{|\mathcal{S}_c|})}$, the fraction of positions in $\mathbf{f}_i$, $\mathbf{f}_j$ where $\mathbf{f}_i$ has a target pattern $t_1 \ldots t_{|\mathcal{S}_c|}$ while $\mathbf{f}_j$ has a target pattern $u_1 \ldots u_{|\mathcal{S}_c|}$. We denote the set of all joint sensor types for sensors of range $c$ observing a target field of area $k^2$ as $\Lambda_{k^2}(c)$. Since the output of each sensor depends only on the contiguous region of targets which it senses, $P_{X_j|X_i}$ depends only on $\lambda$ [5]. Thus, $P_{X_j|X_i}(\mathbf{x}_j|\mathbf{x}_i) = \prod_{\ell=1}^{n} P^{\lambda}(x_{j\ell}|x_{i\ell})$ for all $i, j$ of the same joint type $\lambda$.

The field types $\phi$ and the sensor types $\gamma$ of a field $\mathbf{f}$ must be consistent with each other. Due to the circular boundary conditions of our Markov random field graph, the marginals of types are precisely equal to types over smaller sets. Thus when $c \geq 1$, $\phi$ can be obtained precisely by marginalizing $\gamma$, while for $c = 0$, $\gamma$ can be obtained by marginalizing $\phi$. For $c = 1$ the two types are identical. Further, $\lambda$ also allows computation of $\lambda_{(1)(0)}$ and $\lambda_{(0)(1)}$. These latter quantities correspond to the fraction of grid locations where field $i$ has a target and field $j$ does not, and vice versa.

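We specify two probability distributions which we will utilize in the main theorem. The first is the joint distribution of the ideal output $\mathbf{x}_i$ when $\mathbf{f}_i$ occurs and the noise-corrupted version $\mathbf{y}$ of $\mathbf{x}_i$; i.e., $P_{X_iY}(\mathbf{x}_i, \mathbf{y}) = \prod_{\ell=1}^{n} P_{X_iY}(x_{i\ell}, y_\ell) = \prod_{\ell=1}^{n} P_{X_i}(x_{i\ell}) P_{Y|X}(y_\ell|x_{i\ell})$. The second distribution is the joint distribution of the ideal output $\mathbf{x}_i$ corresponding to $\mathbf{f}_i$ and the noise-corrupted output $\mathbf{y}$ generated by the occurrence of a different target field $\mathbf{f}_j$. We can write this joint distribution as $Q^{(j)}_{X_iY}(\mathbf{x}_i, \mathbf{y}) = \prod_{\ell=1}^{n} Q^{(j)}_{X_iY}(x_{i\ell}, y_\ell) = \prod_{\ell=1}^{n} \sum_{a \in \mathcal{X}} P_{X_i}(x_{i\ell}) P_{X_j|X_i}(x_j = a|x_{i\ell}) P_{Y|X}(y_\ell|x_j = a)$. Note that $\mathbf{X}_i$, $\mathbf{Y}$ are dependent here, although $\mathbf{Y}$ was produced by $\mathbf{X}_j$, because of the dependence of $\mathbf{X}_i$, $\mathbf{X}_j$. This is unlike Shannon codes, where the codewords are independent.

Since each sensor in the sensor network depends only on the targets in the contiguous spatial region which it observes, $P_{X_iY}(\mathbf{x}_i, \mathbf{y})$ depends only on the sensor type $\gamma_i$ of $\mathbf{f}_i$. Thus, we write $P_{X_iY}(\mathbf{x}_i, \mathbf{y}) = \prod_{\ell=1}^{n} P^{\gamma_i}_{X_iY}(x_{i\ell}, y_\ell)$ where $P^{\gamma_i}_{X_iY}(x_i, y) = P^{\gamma_i}(x_i) P_{Y|X}(y|x_i)$. Similarly, $Q^{(j)}_{X_iY}(\mathbf{x}_i, \mathbf{y})$ depends only on the joint sensor type $\lambda$ of $\mathbf{f}_i$, $\mathbf{f}_j$ and can be written as $\prod_{\ell=1}^{n} Q^{\lambda}_{X_iY}(x_{i\ell}, y_\ell)$ where $Q^{\lambda}_{X_iY}(x_i, y) = \sum_{a \in \mathcal{X}} P^{\gamma_i}(x_i) P^{\lambda}(x_j = a|x_i) P_{Y|X}(y|x_j = a)$.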
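To make the two single-sensor distributions concrete, the sketch below builds them for the simplest setting: range $c = 0$ with an identity sensing function, so that the ideal output is the sensed bit and the joint sensor type is a $2 \times 2$ table `lam[t][u]`. It then evaluates the Kullback-Leibler distance between them in bits. The table values, the binary symmetric noise with flip probability 0.1, and the function names are assumptions chosen for illustration.

```python
import math

def joint_pmfs(lam, p_y_given_x):
    """Return (P, Q) as tables indexed [x_i][y].

    lam[t][u] is the fraction of positions where f_i has bit t and f_j has bit u.
    P(x_i, y) = gamma_i(x_i) * P(y | x_i)
    Q(x_i, y) = sum_a lam[x_i][a] * P(y | a)
    """
    gamma_i = [sum(row) for row in lam]
    n_x, n_y = len(lam), len(p_y_given_x[0])
    p = [[gamma_i[x] * p_y_given_x[x][y] for y in range(n_y)] for x in range(n_x)]
    q = [[sum(lam[x][a] * p_y_given_x[a][y] for a in range(n_x))
          for y in range(n_y)] for x in range(n_x)]
    return p, q

def kl_bits(p, q):
    """D(P || Q) in bits; assumes Q > 0 wherever P > 0."""
    return sum(pv * math.log2(pv / qv)
               for p_row, q_row in zip(p, q)
               for pv, qv in zip(p_row, q_row) if pv > 0)

noise = [[0.9, 0.1], [0.1, 0.9]]            # assumed bit-flip noise
same_field = [[0.5, 0.0], [0.0, 0.5]]       # f_j identical to f_i
far_field = [[0.4, 0.1], [0.1, 0.4]]        # fields differ at 20% of positions
d_same = kl_bits(*joint_pmfs(same_field, noise))
d_far = kl_bits(*joint_pmfs(far_field, noise))
print(d_same, d_far)
```

The distance is zero when the two fields coincide and grows as they differ at more positions, which matches the role this quantity plays as the numerator of the capacity bound.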

We are now ready to state the main theorem of this paper. 



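Theorem 1 (Sensing Capacity for pairwise MRF, $c \geq 1$): The sensing capacity at distortion $D$ for target field distribution $P_F$ satisfies,

$$C(D) \geq C_{LB}(D) = \min_{\gamma_i \in T(\phi^*)} \; \min_{\substack{\lambda \\ \lambda_{(0)(1)} + \lambda_{(1)(0)} > D}} \frac{D\big(P^{\gamma_i}_{X_iY} \,\|\, Q^{\lambda}_{X_iY}\big)}{\mathrm{DENOM}}$$

where $\mathrm{DENOM} = H(\lambda) - H(\gamma_i) + H(\phi^*) - D\big(\phi_j \,\|\, \frac{1}{W} P_F \prod_{r=1}^{4} P_{F|F'}\big) - H(\phi_j)$, where the sensors have range $c \geq 1$, and where $\gamma_i$, $\phi_j$ are obtained by marginalizing $\lambda \in \Lambda_{k^2}(c)$. Here, $T(\phi^*)$ consists of the set of $\gamma$ that marginalize to the typical $\phi^*$ (the $\phi_i$ such that $D\big(\phi_i \,\|\, \frac{1}{W} P_F \prod_{r=1}^{4} P_{F|F'}\big) = 0$).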

Proof: We assume a MAP decoder for a fixed sensor network (i.e. fixed and known $\mathbf{f}_j$'s and $\mathbf{x}_j$'s); $g_{MAP}(\mathbf{y}) = \arg\max_j P_{F|Y}(\mathbf{f}_j|\mathbf{y}) = \arg\max_j P_{Y|X}(\mathbf{y}|\mathbf{x}_j) P_F(\mathbf{f}_j)$. For this decoder, we consider $P_e = \sum_i P_{e,i} P_F(\mathbf{f}_i)$, where $P_{e,i}$ is averaged over the random sensor networks. As argued earlier, $P_F(\mathbf{f}_i) = P_F(\phi_i)$, and thus we can write $P_e = \sum_{\phi} P_{e,\phi} P_F(\phi) \alpha(\phi)$ where $\alpha(\phi)$ corresponds to the number of fields $\mathbf{f}_i$ of field type $\phi$. The quantity $P_F(\phi)\alpha(\phi)$ decays exponentially for non-typical $\phi$, and goes to one for the typical $\phi$, as $k$ goes to infinity. Thus the average error probability is dominated by the probability of error for the typical field type $\phi^*$. Note that $P_F(\phi_i)$ is bounded as follows.

Thus, the typical field type $\phi^*$ equals $\frac{1}{W} P_F \prod_{r=1}^{4} P_{F|F'}$. We bound $P_{e,i}$ for a field $\mathbf{f}_i$ of typical field type $\phi^*$. For large $k$, this bound will, given the above arguments, bound the average error probability $P_e$.

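To bound $\Pr[\text{error} \mid i, \mathbf{x}_i, \mathbf{y}]$ we define events $A_{ij} = \{\mathbf{x}_j : P_{Y|X}(\mathbf{y}|\mathbf{x}_j) P_F(\mathbf{f}_j) \geq P_{Y|X}(\mathbf{y}|\mathbf{x}_i) P_F(\mathbf{f}_i) \mid i, \mathbf{x}_i, \mathbf{y}\}$. Since decoding to $j \notin \mathcal{D}_i$ results in error,

$$\Pr[\text{error} \mid i, \mathbf{x}_i, \mathbf{y}] \leq P\Big(\bigcup_{j \notin \mathcal{D}_i} A_{ij}\Big) \leq \sum_{j \notin \mathcal{D}_i} P(A_{ij}) \tag{6}$$

We proceed to bound $P(A_{ij})$. For any $s_{ij} \geq 0$,

$$P(A_{ij}) \leq \sum_{\mathbf{x}_j} P_{X_j|X_i}(\mathbf{x}_j|\mathbf{x}_i) \left( \frac{P_{Y|X}(\mathbf{y}|\mathbf{x}_j) P_F(\mathbf{f}_j)}{P_{Y|X}(\mathbf{y}|\mathbf{x}_i) P_F(\mathbf{f}_i)} \right)^{s_{ij}} \tag{7}$$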

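Substituting (7) into (6),

$$\Pr[\text{error} \mid i, \mathbf{x}_i, \mathbf{y}] \leq \sum_{j \notin \mathcal{D}_i} \sum_{\mathbf{x}_j} P_{X_j|X_i}(\mathbf{x}_j|\mathbf{x}_i) \left( \frac{P_{Y|X}(\mathbf{y}|\mathbf{x}_j) P_F(\mathbf{f}_j)}{P_{Y|X}(\mathbf{y}|\mathbf{x}_i) P_F(\mathbf{f}_i)} \right)^{s_{ij}} \tag{8}$$

The bound (8) has an exponential number of terms. However, it was argued earlier that in our sensor network, $P_{X_i}(\mathbf{x}_i) = P^{\gamma_i,n}(\mathbf{x}_i)$ depends only on the sensor type $\gamma_i$ of the $i$th target field, while $P_{X_j|X_i}(\mathbf{x}_j|\mathbf{x}_i) = P^{\lambda,n}(\mathbf{x}_j|\mathbf{x}_i)$ depends on the joint sensor type $\lambda$ of the $i$th and $j$th target fields. Since we have circular boundary conditions and $c \geq 1$, $\gamma_i$ and $\gamma_j$ can be marginalized to compute $\phi_i$ and $\phi_j$ precisely. It was also shown that $P_F(\mathbf{f}_i) = P_F(\phi_i)$. Thus, we can rewrite (8) by grouping terms according to $\lambda$.

$$\Pr[\text{error} \mid i, \mathbf{x}_i, \mathbf{y}] \leq \sum_{\lambda \in S_i(D)} \beta(i, \lambda, k) \sum_{\mathbf{x}_j} P^{\lambda,n}(\mathbf{x}_j|\mathbf{x}_i) \left( \frac{P_{Y|X}(\mathbf{y}|\mathbf{x}_j) P_F(\phi_j)}{P_{Y|X}(\mathbf{y}|\mathbf{x}_i) P_F(\phi^*)} \right)^{s_\lambda} \tag{9}$$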



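where $S_i(D)$ is the set of joint sensor types that result in an error, i.e.,

$$S_i(D) = \Big\{ \lambda : \lambda \in \Lambda_{k^2}(c), \; \lambda_{(0)(1)} + \lambda_{(1)(0)} > D, \; \gamma_{i,(t_1 \ldots t_{|\mathcal{S}_c|})} = \sum_{(u_1 \ldots u_{|\mathcal{S}_c|})} \lambda_{(t_1 \ldots t_{|\mathcal{S}_c|})(u_1 \ldots u_{|\mathcal{S}_c|})} \Big\} \tag{10}$$

and where we choose $s_{ij} = s_\lambda$ for all $\{i, j\}$ of joint sensor type $\lambda$. Here $\beta(i, \lambda, k)$ is the number of fields $\mathbf{f}_j$ that have a joint type $\lambda$ with respect to $\mathbf{f}_i$. $\beta(i, \lambda, k)$ is bounded as,

$$\beta(i, \lambda, k) \leq 2^{k^2 (H(\lambda) - H(\gamma_i))} \tag{11}$$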

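Combining (9) and (11), averaging over $\mathbf{x}_i$ and $\mathbf{y}$, and using the fact that we are bounding a probability, the following bound holds for $\rho_\lambda \in [0, 1]$ and $s_\lambda = \frac{1}{1+\rho_\lambda}$.

$$P_{e,i} \leq \sum_{\lambda \in S_i(D)} 2^{\rho_\lambda k^2 (H(\lambda) - H(\gamma_i))} \left( \frac{P_F(\phi_j)}{P_F(\phi^*)} \right)^{\frac{\rho_\lambda}{1+\rho_\lambda}} \sum_{\mathbf{x}_i} \sum_{\mathbf{y}} P^{\gamma_i,n}(\mathbf{x}_i) P_{Y|X}(\mathbf{y}|\mathbf{x}_i)^{\frac{1}{1+\rho_\lambda}} \Big( \sum_{\mathbf{x}_j} P^{\lambda,n}(\mathbf{x}_j|\mathbf{x}_i) P_{Y|X}(\mathbf{y}|\mathbf{x}_j)^{\frac{1}{1+\rho_\lambda}} \Big)^{\rho_\lambda}$$

Using the independence of sensor outputs conditional on the target vector, the joint p.m.f.s can be simplified as below,

$$\sum_{\mathbf{x}_i} \sum_{\mathbf{y}} P^{\gamma_i,n}(\mathbf{x}_i) P_{Y|X}(\mathbf{y}|\mathbf{x}_i)^{\frac{1}{1+\rho_\lambda}} \Big( \sum_{\mathbf{x}_j} P^{\lambda,n}(\mathbf{x}_j|\mathbf{x}_i) P_{Y|X}(\mathbf{y}|\mathbf{x}_j)^{\frac{1}{1+\rho_\lambda}} \Big)^{\rho_\lambda} = \Big( \sum_{a_i \in \mathcal{X}} \sum_{b \in \mathcal{Y}} P^{\gamma_i}(a_i) P_{Y|X}(b|a_i)^{\frac{1}{1+\rho_\lambda}} \Big( \sum_{a_j \in \mathcal{X}} P^{\lambda}(a_j|a_i) P_{Y|X}(b|a_j)^{\frac{1}{1+\rho_\lambda}} \Big)^{\rho_\lambda} \Big)^{n} \tag{12}$$

We define the following quantity.

$$E(\rho_\lambda, \lambda) = -\log \Big( \sum_{a_i \in \mathcal{X}} \sum_{b \in \mathcal{Y}} P^{\gamma_i}(a_i) P_{Y|X}(b|a_i)^{\frac{1}{1+\rho_\lambda}} \Big( \sum_{a_j \in \mathcal{X}} P^{\lambda}(a_j|a_i) P_{Y|X}(b|a_j)^{\frac{1}{1+\rho_\lambda}} \Big)^{\rho_\lambda} \Big) \tag{13}$$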

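Since the number of joint sensor types $\lambda$ is upper bounded by $(k^2 + 1)^{2^{2|\mathcal{S}_c|}}$, with $k^2 = \lceil nR \rceil$, and using (12) and (13), $P_{e,i}$ is bounded as,

$$P_{e,i} \leq 2^{-n(-o_1(n) + E_r(R,D))}, \quad E_r(R,D) = \min_{\gamma_i \in T(\phi^*)} \; \min_{\lambda \in S_{\gamma_i}(D)} \; \max_{0 \leq \rho_\lambda \leq 1} \; E(\rho_\lambda, \lambda) - \rho_\lambda R \Big( H(\lambda) - H(\gamma_i) + \frac{1}{1+\rho_\lambda} \Big( H(\phi^*) - D\big(\phi_j \,\|\, \tfrac{1}{W} P_F \textstyle\prod_{r=1}^{4} P_{F|F'}\big) - H(\phi_j) \Big) \Big)$$

where $\gamma_i \in T(\phi^*)$ consists of the set of sensor types that marginalize to the typical field type $\phi^*$, and $S_{\gamma_i}(D)$ is as in (10), with $\gamma_i$. Note that $o_1(n) \to 0$ as $n \to \infty$, so we have not included it in the error exponent $E_r(R, D)$. Observing that $E(0, \lambda) = 0 \; \forall \lambda$, we let $\rho_\lambda$ go to zero, rather than optimizing it, thus resulting in a lower bound on $E_r(R, D)$. In the above expression, this implies that in order for $R$ to be achievable, $\frac{E(\rho_\lambda, \lambda)}{\rho_\lambda} - R\big(H(\lambda) - H(\gamma_i) + H(\phi^*) - D(\phi_j \,\|\, \frac{1}{W} P_F \prod_{r=1}^{4} P_{F|F'}) - H(\phi_j)\big)$ must be positive for all $\gamma_i$, $\lambda$, even as $\rho_\lambda \to 0$. But this implies that the derivative of $E(\rho_\lambda, \lambda)$ with respect to $\rho_\lambda$ at $\rho_\lambda = 0$ must be greater than $R\big(H(\lambda) - H(\gamma_i) + H(\phi^*) - D(\phi_j \,\|\, \frac{1}{W} P_F \prod_{r=1}^{4} P_{F|F'}) - H(\phi_j)\big)$. It can be easily shown that $\partial E(\rho_\lambda, \lambda) / \partial \rho_\lambda \big|_{\rho_\lambda = 0} = D\big(P^{\gamma_i}_{X_iY} \,\|\, Q^{\lambda}_{X_iY}\big)$. Using this derivative in the analysis above, and relaxing the conditions $\lambda \in \Lambda_{k^2}(c)$ by dropping the restriction that target fields are restricted to area $k^2$ in the definition (10) of $S_{\gamma_i}(D)$ (thus, weakening the bound), we see that the sensor network can achieve any rate $R$ bounded as below.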

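$$R \leq \min_{\gamma_i \in T(\phi^*)} \; \min_{\substack{\lambda \\ \lambda_{(0)(1)} + \lambda_{(1)(0)} > D}} \frac{D\big(P^{\gamma_i}_{X_iY} \,\|\, Q^{\lambda}_{X_iY}\big)}{\mathrm{DENOM}} \tag{14}$$

where $\mathrm{DENOM} = H(\lambda) - H(\gamma_i) + H(\phi^*) - D\big(\phi_j \,\|\, \frac{1}{W} P_F \prod_{r=1}^{4} P_{F|F'}\big) - H(\phi_j)$. Therefore the right-hand side of (14) is a lower bound on $C(D)$. ■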
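The two properties of $E(\rho_\lambda, \lambda)$ used in the proof, namely $E(0, \lambda) = 0$ and a slope at $\rho_\lambda = 0$ equal to the Kullback-Leibler distance between the two single-sensor joint distributions, can be checked numerically. The sketch below does this for the simplest case of range $c = 0$ with an identity sensing function, where $\lambda$ is a $2 \times 2$ table, using a forward finite difference. The table values, the bit-flip noise, the step size, and the function names are assumptions chosen for illustration.

```python
import math

def exponent_e(rho, lam, noise):
    """E(rho, lambda) from (13), in bits, for a joint type table lam[a_i][a_j]."""
    gamma_i = [sum(row) for row in lam]
    s = 1.0 / (1.0 + rho)
    total = 0.0
    for a_i, row in enumerate(lam):
        if gamma_i[a_i] == 0:
            continue
        for b in range(len(noise[0])):
            # Inner sum over a_j of P^lambda(a_j | a_i) * P(b | a_j)^s
            inner = sum((row[a_j] / gamma_i[a_i]) * noise[a_j][b] ** s
                        for a_j in range(len(row)))
            total += gamma_i[a_i] * noise[a_i][b] ** s * inner ** rho
    return -math.log2(total)

def kl_p_q(lam, noise):
    """D(P || Q) in bits with P = gamma_i * noise and Q = sum_a lam * noise."""
    gamma_i = [sum(row) for row in lam]
    total = 0.0
    for a_i, row in enumerate(lam):
        for b in range(len(noise[0])):
            p = gamma_i[a_i] * noise[a_i][b]
            q = sum(row[a_j] * noise[a_j][b] for a_j in range(len(row)))
            if p > 0:
                total += p * math.log2(p / q)
    return total

noise = [[0.9, 0.1], [0.1, 0.9]]   # assumed bit-flip noise
lam = [[0.4, 0.1], [0.1, 0.4]]     # assumed joint sensor type
step = 1e-6
slope = (exponent_e(step, lam, noise) - exponent_e(0.0, lam, noise)) / step
print(exponent_e(0.0, lam, noise), slope, kl_p_q(lam, noise))
```

The finite-difference slope agrees with the divergence up to the discretization error of the step, consistent with letting $\rho_\lambda$ go to zero in the error exponent.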
For the case of $c = 0$, the proof has one primary difference. Since the field type $\phi$ can be marginalized to compute the sensor types $\gamma$, all the target fields are grouped according to $\phi$. We let $\mu$ be the joint field type of target fields $\mathbf{f}_i$ (with field type $\phi^*$) and $\mathbf{f}_j$. Using these definitions we can write the sensing capacity theorem for the case of $c = 0$ as follows.
Fig. 3. $C_{LB}(D)$ curves for environments with different probability distributions (e.g. higher $p$ implies higher target sparsity).



Theorem 2 (Sensing Capacity for pairwise MRF, $c = 0$): The sensing capacity at distortion $D$ for target field distribution $P_F$ satisfies,

$$C(D) \geq C_{LB}(D)$$

where $\mathrm{DENOM} = H(\mu) - D\big(\phi_j \,\|\, \frac{1}{W} P_F \prod_{r=1}^{4} P_{F|F'}\big) - H(\phi_j)$, where $\phi^*$ corresponds to the typical type, and where $\phi^*$, $\phi_j$, $\gamma_i$, and $\lambda$ are obtained by marginalizing the joint field type $\mu$.



IV. Capacity Bound Examples

We compute the capacity bound $C_{LB}(D)$ for environments with probabilistic models of the form $P_F = [p \;\; (1-p)]$ and $P_{F|F'} = [p \;\; (1-p); \; (1-p) \;\; p]$ where $p \in [0, 1]$. In Figure 3 we demonstrate the effect of structure in the environment on $C_{LB}(D)$ by varying $p$. $p = 0.5$ corresponds to an unstructured environment (all $\mathbf{f}$ equally likely), and increasing values of $p$ correspond to increasing spatial structure (e.g. increasing target sparsity). We assume that the sensors have range $c = 0$ (i.e. they sense only one target) and that the sensing function $\Psi$ is the identity function. The sensor noise model assumes that the sensor's output is flipped with probability 0.1. Figure 3 demonstrates that $C_{LB}(D)$ increases for more structured environments (i.e. fewer sensors are needed as $p$ increases).